ABSTRACT

Hydrologists and other users need to know the uncertainty of satellite rainfall data sets across the range of time/space scales over the whole domain of the data set. Here, ‘uncertainty’ refers to the general concept of the ‘deviation’ of an estimate from the reference (or ground truth), where the deviation may be defined in multiple ways. This uncertainty information can give the user insight into the realistic limits of utility, such as hydrologic predictability, that can be achieved with these satellite rainfall data sets. However, satellite rainfall uncertainty estimation requires ground validation (GV) precipitation data. On the other hand, satellite data will be most useful over regions that lack GV data, for example developing countries. This paper addresses the open issues for developing an appropriate uncertainty transfer scheme that can routinely estimate various uncertainty metrics across the globe by leveraging a combination of spatially-dense GV data and temporally sparse surrogate (or proxy) GV data, such as the Tropical Rainfall Measuring Mission (TRMM) Precipitation Radar and the Global Precipitation Measurement (GPM) mission Dual-Frequency Precipitation Radar. The TRMM Multi-satellite Precipitation Analysis (TMPA) products over the US spanning a record of 6 years are used as a representative example of satellite rainfall. It is shown that there exists a quantifiable spatial structure in the uncertainty of satellite data for spatial interpolation. Probabilistic analysis of sampling offered by the existing constellation of passive microwave sensors indicates that transfer of uncertainty for hydrologic applications may be effective at daily time scales or higher during the GPM era. Finally, a commonly used spatial interpolation technique (kriging), which leverages the spatial correlation of estimation uncertainty, is assessed at climatologic, seasonal, monthly and weekly timescales. It is found that the effectiveness of kriging is sensitive to the type of uncertainty metric, time scale of transfer and the density of GV data within the transfer domain. Transfer accuracy is lowest at weekly timescales with the error doubling from monthly to weekly. However, at very low GV data density (<20% of the domain), the transfer accuracy is too low to show any distinction as a function of the timescale of transfer.

Keywords: Satellite precipitation, uncertainty, transfer, spatial interpolation, GPM.


1.0 INTRODUCTION 


Precipitation is arguably one of the most important components of the water cycle over land. One study shows that almost 70-80% of the variability in the terrestrial water cycle can be explained by the spatio-temporal variability observed in precipitation over land (Syed et al., 2004). Existing missions such as the Tropical Rainfall Measuring Mission (TRMM) provide vital precipitation information for water cycle studies (Huffman et al., 2007). Furthermore, planned missions such as the Global Precipitation Measurement (GPM) mission will provide a global hydrologic remote sensing observatory to advance the use of precipitation sensing technologies in scientific inquiry into hydrologic processes (Krajewski et al., 2006). With the global and more frequent precipitation observational capability planned for GPM, such precipitation measuring satellite missions permit us to refine knowledge from physical and hydrologic models that can then be converted to local and global strategies for water resources management (Voisin et al., 2008; Hossain et al., 2007). [Hereafter, because our focus is on liquid precipitation, the term ‘rainfall’ will be used as shorthand for ‘precipitation’ for convenience.]

However, a crucial challenge in advancing satellite rainfall-based surface hydrologic prediction is the need to bridge the scale incongruity between overland hydrologic processes that evolve at small scales (i.e., < 1 hour and < 5 km) and operational satellite precipitation datasets that will always be restricted to coarser scales from passive microwave sensors (i.e., > 1 hour and > 5 km; Hossain and Lettenmaier, 2006). Two paths have historically been followed in response to this scale incongruity: 1) apply satellite rainfall data available at the native scale for hydrologic prediction (e.g., Harris and Hossain, 2008; Su et al., 2008; Voisin et al., 2008); and 2) apply spatial and spatio-temporal disaggregation (or downscaling) techniques to resolve satellite rainfall data at the required smaller space-time scales for hydrologic prediction (e.g., Forman et al., 2009; Bindlish and Barros, 2000). Each option leads to non-negligible uncertainty in hydrologic simulation. In the first case, the major source of this uncertainty is the algorithmic and sampling uncertainty (for passive microwave, or PMW, sensors) of satellite rainfall data at the native scale. In the second case, the primary source of uncertainty is the statistical disaggregation technique, which further propagates the native scale uncertainty to sub-grid uncertainty in ways that are not well understood (see for example, Rahman et al., 2009). Either way, hydrologists and other users need to know the uncertainty of the satellite rainfall data sets across the range of time/space scales over the whole domain of the data set. This uncertainty can provide insightful information to the user on the realistic limits of utility that can be achieved with satellite rainfall data sets, for example for hydrologic predictability (Hong et al., 2006), on which we will focus in this paper. While representing the uncertainty structure of satellite rainfall as a function of scale against quality-controlled ground validation datasets remains a critical research problem for GPM, therein lies a paradox. Satellite rainfall uncertainty estimation requires ground validation (GV) precipitation data. On the other hand, satellite data will be most useful over ungauged regions in the developing world (Tang and Hossain, 2009).

In-situ rainfall information from rain gauge networks is generally considered the standard choice for GV data (Villarini and Krajewski, 2007; Habib et al., 2004; McCollum et al., 2002). Such data are often referred to as ‘reference’ or ‘truth’. However, in-situ gauges are point measurements, and unless there exists a dense network to adequately capture the space-time variability of the rainfall process, their use for validating areal-averaged satellite rainfall data for surface hydrologic processes remains questionable (Ciach and Krajewski, 1999). The work of Gebremichael et al. (2003) clearly demonstrates the sensitivity of satellite rainfall uncertainty estimation to gauge density. Thus, in most regions across the globe without adequate in-situ rain gauge coverage, the uncertainty associated with satellite data has so far been parameterized in terms of the sampling configuration of the sensors (Li et al., 1998; Huffman, 1997). Some examples of this parameterization are the Global Precipitation Climatology Project (GPCP; Huffman, 2005; Huffman et al., 1997) dataset and the TRMM Multi-satellite Precipitation Analysis (TMPA; Huffman et al., 2007), which now provide an estimate of the Root Mean-Squared Error (RMSE) of the satellite rainfall estimates on the basis of the sampling pattern and the period of rainfall accumulation of interest to the user.

While such parameterized methodologies for estimating uncertainty have been useful in providing users with a level of confidence associated with satellite rainfall estimates, such uncertainty is essentially a standard deviation measure of sampling uncertainty. Many of these uncertainty methodologies are based on the conceptual argument that uncertainty (i.e., standard deviation, $\sigma_E$) can be related directly or inversely to observation interval ($\Delta t$), observation period ($T$), spatial averaging area ($A$), and rain rate ($R$):

$\sigma_E = f\left(\frac{1}{R}, \frac{1}{A}, \frac{\Delta t}{T}, \text{parameters}\right)$    (1)

as expressed by Steiner et al. (2003), among others. In many cases, the functional form of this ‘predicted’ uncertainty is not benchmarked to the realities of the ground observations and hence may not provide a reasonable indication of the expected reliability for water cycle studies (Gebremichael et al., 2010). Recently, several other parameterized methodologies have evolved based on data assimilation approaches (e.g., Kalman filtering in the GSMaP satellite product of Ushio et al., 2009). In these approaches, the available estimate of uncertainty is essentially related to the methodology of the filtering technique and does not necessarily indicate the actual level of agreement with GV rainfall data. In some instances, however, the uncertainty is estimated by comparing the output of a wide-coverage technique (such as infrared (IR)-advected PMW) to a more localised but higher accuracy product (such as PMW only; Ushio et al., 2009).

There now exists a sufficient body of knowledge on uncertainty metrics and models that we should consider a transition to a more hydrologically-relevant framework in anticipation of the satellite data-rich scenario of GPM. Although existing uncertainty metrics and uncertainty models represent an important first step, most treat uncertainty as a single measure representative of a large space and time domain. This uni-dimensional uncertainty measure is invariably the standard deviation of uncertainty (e.g., Eqn. 1). However, a satellite rainfall product with an uncertainty standard deviation ($\sigma_E$) of X mm/hr over a large space-time domain can be represented by a multiplicity of distinct spatio-temporal patterns of rainfall, each having a distinct response in surface hydrology (see for example, Lee and Anagnostou, 2004).

What is therefore needed now for advancing the hydrological application of GPM is a practical methodology that can routinely ‘transfer’ a set of hydrologically-relevant uncertainty metrics from locations/regions having GV-based values to ungauged regions for improving water cycle studies or water resources management. Here, ‘transfer’ is akin to spatial interpolation at non-sampled locations (grid boxes) using measurements from sampled but sparse locations (grid boxes). Figure 1 provides a conceptual rendition of this idea of ‘transfer’ of uncertainty based on the concept of spatial interpolation (taken from Ling and Hossain, 2009).

This paper analyzes the open issues for developing an appropriate uncertainty transfer scheme that can routinely estimate various uncertainty metrics across the globe by leveraging a combination of spatially-dense GV data and temporally sparse surrogate (or proxy) GV data from sources such as the TRMM-like PR sensor anticipated during the GPM era. The TRMM Multi-satellite Precipitation Analysis (TMPA) products 3B42RT and 3B41RT (Huffman et al., 2007) over the US spanning a record of 6 years are used as a representative example of satellite rainfall. The paper presents a probabilistic analysis of sampling offered by the existing constellation of precipitation-relevant satellite PMW sensors in order to understand the current and expected spatial coverage during the GPM era. A commonly used spatial interpolation technique (kriging), which leverages the spatial correlation of rainfall estimation uncertainty, is then investigated for its effectiveness. This effectiveness is cast in the context of the sparseness in GV data expected from the TRMM and GPM missions. Finally, important issues needing closure are summarized on the basis of our investigation of transfer of satellite rainfall uncertainty from GV to non-GV regions. To avoid confusion among readers, hereafter, the terms ‘uncertainty’ or ‘uncertainty metric’ will be used to define the quality indices of the satellite rainfall estimate derived at GV locations (such as bias, root mean squared error, probability of detection). The terms ‘error’ or ‘transfer error’ will be used specifically to define the quality of the transfer (spatial interpolation) process of uncertainty metrics at non-GV locations.
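
To make the notion of an ‘uncertainty metric’ at a GV grid box concrete, the following is a minimal sketch in Python. It assumes the usual contingency-table definitions of POD and FAR and the usual definitions of bias and RMSE; the function name `uncertainty_metrics`, the `rain_threshold` parameter and the sample series are hypothetical, and the exact formulations used in this paper are those of its Appendix, which may differ.

```python
import numpy as np

def uncertainty_metrics(sat, ref, rain_threshold=0.0):
    """Quality indices of a satellite rain series against a GV reference.

    sat, ref: 1-D rain-rate series for one grid box (same units and times).
    rain_threshold: value above which a time step counts as 'rain' (assumed).
    """
    sat = np.asarray(sat, dtype=float)
    ref = np.asarray(ref, dtype=float)
    sat_rain = sat > rain_threshold
    ref_rain = ref > rain_threshold

    # 2x2 contingency table of rain / no-rain detection
    hits = np.sum(sat_rain & ref_rain)
    misses = np.sum(~sat_rain & ref_rain)
    false_alarms = np.sum(sat_rain & ~ref_rain)
    correct_neg = np.sum(~sat_rain & ~ref_rain)

    def ratio(num, den):
        return float(num) / float(den) if den > 0 else float("nan")

    return {
        "bias": float(np.mean(sat - ref)),
        "rmse": float(np.sqrt(np.mean((sat - ref) ** 2))),
        "pod_rain": ratio(hits, hits + misses),
        "pod_norain": ratio(correct_neg, correct_neg + false_alarms),
        "far": ratio(false_alarms, hits + false_alarms),
    }

# Hypothetical 3-hourly series (mm/hr) for a single grid box
sat = np.array([0.0, 2.0, 1.0, 0.0, 3.0, 0.0])
ref = np.array([0.0, 1.0, 0.0, 0.0, 2.0, 1.0])
metrics = uncertainty_metrics(sat, ref)
```

Applying such a function to every GV grid box yields one map per metric; those maps are the fields that a transfer scheme would interpolate to non-GV grid boxes.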

2.0 SPATIAL CORRELATION OF SATELLITE RAINFALL UNCERTAINTY

The very first requirement for an effective transfer (spatial interpolation) scheme is the presence of a quantifiable spatial structure (or spatial correlation) in the variable being transferred. Therefore, we first investigated the presence of spatial correlation of satellite rainfall uncertainty. First, in order to minimize the error of the GV rainfall data, we used the National Centers for Environmental Prediction’s (NCEP) 4 km Stage IV NEXRAD rainfall data, which are adjusted to precipitation gauges and conveniently available as a quality-controlled data mosaic over the U.S. (Lin and Mitchell, 2005; Fulton et al., 1998). TMPA’s near real-time satellite rainfall data products from PMW-calibrated infrared (IR) and merged PMW-IR estimates (labeled 3B41RT and 3B42RT, respectively; Huffman et al., 2007) were used as the satellite rainfall data. The GV and satellite rainfall data spanned 6 years from 2002 to 2007. The NEXRAD Stage IV GV rainfall data were first remapped to 0.25° 3-hourly resolution for consistency with the native scale of the satellite rainfall products. 3B41RT data were also remapped to the 3-hourly time scale. After a thorough quality assessment and quality control (QA/QC), the datasets were organized by season and various regions for the years 2002-2007.

In order to study how the satellite rainfall uncertainty is spatially dependent (or correlated), Tang and Hossain (2009) derived the spatial correlograms for each uncertainty metric using the TMPA dataset described above. Herein, the correlation length (CL), where the autocorrelation dropped to 1/e (e-folding distance), was first computed. Next, the empirical semi-variograms were derived and then idealized as exponential semi-variogram functions,

$\gamma(h) = c_0 + c\left(1 - e^{-h/a}\right)$    (2)

where $\gamma(h)$ is the semi-variance at spatial lag $h$; $c_0$ represents the nugget variance (i.e., the minimum variability observed or the ‘noise’ level at a separation distance of 0); $c$ is the sill variance (when spatial lag is infinite); and $a$ is the correlation length. Figure 2 provides a summary of the ‘climatologic’ correlation length (e-folding distance) by season for various uncertainty metrics of the satellite rainfall products such as Probability of Detection (POD) for rain, POD for no rain, false alarm ratio (FAR), root mean squared error (RMSE), and bias. Herein, ‘climatologic’ refers to the mean error derived from the entire 6 years of data. Appendix 1 provides the mathematical formulation for the error metrics.
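
The two quantities used here, the exponential semi-variogram of Eqn. 2 and the e-folding correlation length, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the linear interpolation used to locate the 1/e crossing, and the synthetic correlogram are assumptions and are not taken from Tang and Hossain (2009).

```python
import numpy as np

def exponential_semivariogram(h, nugget, sill, corr_length):
    """Eqn. 2: gamma(h) = c0 + c * (1 - exp(-h / a))."""
    h = np.asarray(h, dtype=float)
    return nugget + sill * (1.0 - np.exp(-h / corr_length))

def e_folding_distance(lags, autocorr):
    """First lag at which the correlogram falls to 1/e (linear interpolation)."""
    lags = np.asarray(lags, dtype=float)
    autocorr = np.asarray(autocorr, dtype=float)
    target = 1.0 / np.e
    below = np.nonzero(autocorr <= target)[0]
    if below.size == 0:
        return float("nan")  # correlogram never drops to 1/e within the lags
    j = below[0]
    if j == 0:
        return float(lags[0])
    # interpolate between the last point above 1/e and the first at or below it
    frac = (autocorr[j - 1] - target) / (autocorr[j - 1] - autocorr[j])
    return float(lags[j - 1] + frac * (lags[j] - lags[j - 1]))

# Synthetic correlogram with a known correlation length of 4 grid boxes
lags = np.arange(0.0, 10.0, 0.25)        # spatial lag, in 0.25-degree grid boxes
autocorr = np.exp(-lags / 4.0)
cl = e_folding_distance(lags, autocorr)  # close to 4 grid boxes
gamma_at_cl = exponential_semivariogram(cl, nugget=0.1, sill=0.9, corr_length=4.0)
```

With an empirical correlogram in place of the synthetic one, `cl` would play the role of the correlation length summarized by season and metric.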

Figure 2 clearly demonstrates that, at the climatologic (long-term) time scale, satellite rainfall uncertainty can have distinct spatial organization that can be leveraged for spatial interpolation. The correlation lengths for a given uncertainty metric as a function of season appear to be at least 3-5 (0.25°) TMPA grid boxes long. As a rule of thumb, this indicates that the transfer of uncertainty from sampled locations may be effective up to 4 grid boxes (~100 km) away. Another interesting feature revealed in this figure is the significantly higher correlation lengths (and spatial organization) observed for 3B41RT than for 3B42RT. This can be traced to the sources of the specific satellite estimates: 3B41RT is uniformly computed using a calibration of infrared (IR) brightness temperatures to a combined PMW estimate. The statistics of the uncertainty are spatially very homogeneous since they originate from a single probability distribution at regional scales. On the other hand, 3B42RT uses a variety of PMW rainfall estimates, with gaps during a 3-hour sampling period filled with the 3B41RT estimate ‘as is’. This fill-in causes the 3B42RT data to draw on two different probability distributions in space for uncertainty statistics (IR and PMW); the increased spatial heterogeneity in the uncertainty structure leads to a shorter correlation length. This analysis shows that any uncertainty transfer scheme should benefit from improvements in the 3B42RT product that make it statistically more homogeneous in space.

3.0 SPATIAL COVERAGE OFFERED BY CURRENT CONSTELLATION OF PMW SENSORS

Having observed a distinct spatial organization of uncertainty, we also need to understand the space/time dimension that is implicit in the concept of real-time uncertainty ‘transfer’ over non-GV regions. The space dimension pertains to the regions with spatially sparse GV data due to inadequate in-situ gauge data (such as that shown in Figure 1), which are, of course, recorded at fixed positions. The time dimension pertains to the temporally sparse case of using the most accurate rainfall source currently available from space, such as the orbiting TRMM PR, as ‘proxy’-GV data over regions where there is no ground-based GV data. Depending on how we define GV data, there can be several types of GV ‘voids’ where uncertainty information will need to be estimated for GPM. For example, if we rely on the ‘conventional’ ground source for GV data, voids will be represented by large and stationary regions having little or no instrumentation. On the other hand, if a ‘proxy’ for GV is defined from orbiting sensors, such as the TRMM PR, or even a highly accurate PMW sensor, then voids will be numerous grid boxes dynamically changing in location with each satellite orbit.

The left panels of Figure 3 show the probability of a 3B42RT grid box (0.25°) having a conical-scanning PMW overpass (comprising either TMI, SSMI or AMSR) in 3, 6 and 24 hour windows. The right panels of Figure 3 show the probability of a 3B42RT grid box having a TRMM Microwave Imager (TMI) scan. These probability maps were created using a 100-day period (of any PMW sensor) from the 2007-2008 period. It is clear from the maps that the spatio-temporal dynamics of the location of PMW scans are strongly sensitive to the accumulation periods of hydrologic relevance, and that this sensitivity must be investigated carefully in order to identify how an uncertainty transfer scheme may work using proxy-GV data.

At time scales of 3-6 hours, there are vast regions lacking conventional surface GV data in the tropics of Africa, Asia and South America where the probability of having a PMW scan is less than 50% (Figure 3). This makes the estimation of uncertainty through transfer from GV regions at these locations more important for hydrologic applications. While GPM may improve the coverage of PMW scans, such large voids with a low probability for a PMW scan will still remain over these regions due to the continued dependence on polar-orbiting sensors. Since gauge-based GV is sparse for these tropical regions at hydrologic scales, proxy GV data from space-borne sensors (such as that expected from the GPM Dual-Frequency Precipitation Radar) may be one of the few ways to explore whether ‘transfer’ of uncertainty is realistic. For the higher latitude land regions (which comprise mostly the industrialized world with reasonably gauged fixed-location GV instrumentation), the uncertainty could be transferred from the stationary GV regions. The right panels of Figure 3 also show that the probability of having a TMI scan happens to be lowest (0.1-0.2 in 3 hours) over the tropical regions. However, over a 24 hour time period there is a considerably higher probability of having such a scan (~0.7-0.8). This implies that the practicable timescale for transferring uncertainty metrics over the tropics from a sun-asynchronous PMW sensor is at least 24 hours.
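
A back-of-envelope check of how the scan probability grows with the accumulation window can be sketched as follows. The sketch assumes that successive 3-hour windows are statistically independent, which real orbital sampling is not, so it only indicates the order of magnitude; the function name is hypothetical.

```python
def prob_at_least_one_scan(p_window, n_windows):
    """Probability of at least one scan in n independent windows,
    each with scan probability p_window (independence is an assumption)."""
    return 1.0 - (1.0 - p_window) ** n_windows

# 3-hour TMI scan probabilities over the tropics quoted in the text (0.1-0.2),
# accumulated over the eight 3-hour windows of one day
p24 = {p3: prob_at_least_one_scan(p3, 8) for p3 in (0.1, 0.2)}
```

Under this simplification the 24-hour probability comes out at roughly 0.57 to 0.83, which is broadly in line with the ~0.7-0.8 read from Figure 3.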

4.0 TRANSFER OF UNCERTAINTY BY SPATIAL INTERPOLATION

Tang and Hossain (2009) recently showed that most uncertainty metrics (such as bias and POD) are amenable to ‘transfer’ from gauged to ungauged locations using spatial interpolation at climatologic (six-year average) timescales. The method of ordinary kriging (OK) was used for testing the ‘transfer’ of uncertainty metrics. OK is the most common (and one of the simplest) spatial interpolation estimators used to find the best linear unbiased estimate of a second-order stationary random field with an unknown constant mean as follows:

$Z(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$    (3)

where $Z(x_0)$ is the kriging estimate at location $x_0$; $Z(x_i)$ is the sampled value at location $x_i$; $\lambda_i$ is the weighting factor for $Z(x_i)$ (summing to one over all $i$); and $n$ is the number of sampled (known) locations. Kriging methods have already been used for spatial interpolation of precipitation from point gauge data with considerable success (see for example, Seo et al., 1990; Krajewski, 1987).
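
The weights $\lambda_i$ in Eqn. 3 follow from solving the standard ordinary kriging system, in which a Lagrange multiplier enforces that the weights sum to one. The sketch below is a minimal NumPy illustration under the exponential semi-variogram of Eqn. 2; the function names, the variogram parameters and the four-point example are hypothetical, and it is not the implementation used in the study.

```python
import numpy as np

def exp_variogram(h, nugget, sill, corr_length):
    """Exponential semi-variogram (Eqn. 2), with gamma(0) = 0 at zero lag."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + sill * (1.0 - np.exp(-h / corr_length))
    return np.where(h > 0.0, gamma, 0.0)

def ordinary_kriging(xy_known, z_known, xy_target, nugget, sill, corr_length):
    """Return OK estimates (Eqn. 3) and weights for each target location."""
    xy_known = np.asarray(xy_known, dtype=float)
    z_known = np.asarray(z_known, dtype=float)
    xy_target = np.atleast_2d(np.asarray(xy_target, dtype=float))
    n = len(z_known)

    # Left-hand side: variogram between known points, bordered by ones
    d_kk = np.linalg.norm(xy_known[:, None, :] - xy_known[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d_kk, nugget, sill, corr_length)
    A[n, n] = 0.0

    # Right-hand side: variogram between known points and each target
    d_kt = np.linalg.norm(xy_known[:, None, :] - xy_target[None, :, :], axis=-1)
    b = np.ones((n + 1, xy_target.shape[0]))
    b[:n, :] = exp_variogram(d_kt, nugget, sill, corr_length)

    sol = np.linalg.solve(A, b)
    weights = sol[:n, :]              # lambda_i, one column per target
    estimates = weights.T @ z_known   # Eqn. 3
    return estimates, weights

# Four 'GV' grid boxes at the corners of a unit square, target at the centre
xy_known = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z_known = np.array([1.0, 2.0, 3.0, 4.0])
est, w = ordinary_kriging(xy_known, z_known, [[0.5, 0.5]],
                          nugget=0.0, sill=1.0, corr_length=2.0)
```

In this symmetric example the four weights are equal, so the estimate is simply the mean of the four sampled values; with irregularly placed GV grid boxes the weights depend on the fitted variogram.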

Using the same six-year database of high resolution satellite rainfall data from TMPA over the central US, the OK method was applied to assess the effectiveness of transfer of uncertainty metrics from GV to non-GV grid boxes, using correlation as the main assessment metric. Assuming that only 50% of the region (i.e., grid boxes) was gauged (i.e., having access to GV data), OK was implemented to estimate uncertainty metrics at the other 50% of the (non-GV) region. Selection of ‘GV’ grid boxes was random and hence each kriging realization was repeated 10 times in a Monte Carlo (MC) fashion to derive an average scenario. The semi-variogram and correlation lengths were computed on the basis of the 50% of the assumed ‘available’ data. Spatial correlograms for each uncertainty metric were derived and the correlation length (CL), where the autocorrelation dropped to 1/e (e-folding distance), was computed. The empirical semi-variograms were derived and then idealized as exponential semi-variogram functions.
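
The design of this experiment (random selection of ‘GV’ grid boxes, interpolation to the remaining boxes, and averaging of the skill over repeated realizations) can be sketched as follows. To keep the sketch short, inverse-distance weighting is used as a stand-in interpolator; it is not kriging and is not the method of the study. All names, the synthetic field and the default settings are hypothetical.

```python
import numpy as np

def idw_interpolate(xy_known, z_known, xy_target, power=2.0):
    """Inverse-distance weighting, used here only as a stand-in for kriging."""
    d = np.linalg.norm(xy_target[:, None, :] - xy_known[None, :, :], axis=-1)
    w = 1.0 / np.maximum(d, 1e-12) ** power
    return (w @ z_known) / w.sum(axis=1)

def monte_carlo_transfer_skill(field, gv_fraction=0.5, n_realizations=10,
                               seed=0, interpolate=idw_interpolate):
    """Mean correlation between transferred and actual values at non-GV boxes."""
    rng = np.random.default_rng(seed)
    ny, nx = field.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    xy = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
    z = field.ravel().astype(float)
    n_gv = int(round(gv_fraction * z.size))

    scores = []
    for _ in range(n_realizations):
        order = rng.permutation(z.size)            # random choice of 'GV' boxes
        gv, non_gv = order[:n_gv], order[n_gv:]
        est = interpolate(xy[gv], z[gv], xy[non_gv])
        scores.append(np.corrcoef(est, z[non_gv])[0, 1])
    return float(np.mean(scores))

# Synthetic, smoothly varying 'uncertainty metric' field on a 10 x 10 grid
yy, xx = np.mgrid[0:10, 0:10]
field = xx + 2.0 * yy
skill = monte_carlo_transfer_skill(field, gv_fraction=0.5, n_realizations=10)
```

Passing a kriging routine as `interpolate` and varying `gv_fraction` would reproduce the structure, though not the data, of the GV-coverage experiments described in this section.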

Tang and Hossain (2009) showed that the transfer of uncertainty metrics using kriging did not lead to wholesale changes in the pattern of the uncertainty field when compared to the true climatologic uncertainty field (see upper left and upper right panels of Figure 4a). Overall, their assessment indicated that the ‘transfer’ of uncertainty metrics from a gauged to an ungauged location through spatial interpolation has merit for selected uncertainty metrics. In Figure 4b, the histograms for ‘kriging error’ and actual uncertainty (over ungauged grid boxes) demonstrate the accuracy of the transfer method for FAR. Here, the ‘kriging error’ refers to the difference between ‘kriged uncertainty’ and ‘actual uncertainty’, whereas the ‘actual uncertainty’ is the ‘measured uncertainty’. In other words, the ‘kriging error’ is the estimation uncertainty while the ‘actual uncertainty’ is the true dataset uncertainty. The actual GV-based uncertainty (i.e., FAR in this case) is shown in pink while the black line represents the histogram for kriging error. The histograms for kriging error are considerably lower, by almost an order of magnitude, compared to the actual GV-based uncertainty and are almost unbiased.

However, a point to note is that the work of Tang and Hossain (2009) demonstrated the utility of transfer only at the climatologic time scales with a high degree of GV coverage (50%). Also, at the climatologic scales, the spatial structure of uncertainty can be expected to be well defined and reasonably homogeneous (longer correlation lengths of uncertainty that lead to high accuracy for kriging; see Figure 2). Furthermore, the correlation measure may not necessarily provide the most rigorous assessment of accuracy for the transfer of uncertainty metrics. For example, there may be high correlation even with large systematic bias in the ‘kriged’ uncertainty metric at non-GV grid boxes. In this study, we therefore explored the effectiveness of kriging at seasonal (and shorter) timescales and modeled how the effectiveness of transfer is impacted by GV data coverage. We also assessed the accuracy of transfer using marginal and non-correlation type measures.

Figure 5 shows how GV coverage (as randomly located grid boxes over a region) impacts the accuracy of kriging-based transfer of uncertainty over the grid boxes lacking GV data for two different time scales (climatologic and seasonal in the left and right panels, respectively). The exercise was performed in a manner similar to Tang and Hossain (2009) over the central United States. The GV coverage was systematically varied from 10% to 90% and the effectiveness of kriging of the uncertainty metric at locations lacking GV data was assessed using the correlation measure with the in-situ (sampled) uncertainty metric. For the seasonal case, the summer months of June-July-August in 2007 over the central US were chosen as an example and one seasonal variogram was modeled.

The most striking feature of the GV-density study is that the effectiveness of an uncertainty transfer scheme, specifically kriging in this example, worsens considerably at low GV coverage (correlation dropping to under 0.7) as time scales shorten. Qualitatively, this result is expected, and it clearly indicates that if a transfer scheme for estimating uncertainty metrics operates at scales finer than seasonal (ranging from 3-6 hourly to weekly-monthly), the effectiveness of uncertainty transfer with kriging would intuitively worsen further. A similar assessment can be made from Figure 3 on the potential of kriging using dynamically located sun-asynchronous PMW scans (such as TMI) as proxy-GV data. At 3-6 hours, the probability of a grid box being scanned by a sun-asynchronous TMI ranges from 0.1 to 0.4. In other words, this is equivalent to a large region having a fixed GV coverage of 10-40%. Naturally therefore, the effectiveness of OK over the tropics using proxy-GV data in the GPM era is unlikely to be any better at timescales shorter than a day.

In order to demonstrate a more rigorous level of accuracy of the interpolation scheme beyond the correlation measure, we reviewed our kriging simulations for two more scenarios: 1) transfer of uncertainty metric at monthly time scales and 2) transfer of uncertainty metric at weekly time scales. For each scenario, we performed a more in-depth assessment for the months of summer and weeks of June of 2007. We used mean error, in place of the correlation measure, to assess the accuracy of the transfer, using the following error definitions:

Error = (Interpolated Uncertainty Metric - Actual Uncertainty Metric) / Actual Uncertainty Metric    (4)

Mean Error = Mean of Error (as defined above in Eqn. 4) over non-GV grid boxes    (5)

Std. Dev. of Error = Standard Deviation of Error (as defined above) over non-GV grid boxes    (6)
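
Eqns. 4 to 6 translate directly into a few lines of Python. The sketch below is illustrative: the function name and sample values are hypothetical, grid boxes where the actual metric is zero are skipped because Eqn. 4 is undefined there (an assumption, since the text does not say how such boxes were handled), and the population form of the standard deviation is assumed.

```python
import numpy as np

def transfer_error_stats(interpolated, actual):
    """Mean (Eqn. 5) and standard deviation (Eqn. 6) of the relative
    transfer error (Eqn. 4) over non-GV grid boxes."""
    interpolated = np.asarray(interpolated, dtype=float)
    actual = np.asarray(actual, dtype=float)
    valid = actual != 0.0  # Eqn. 4 is undefined where the actual metric is zero
    error = (interpolated[valid] - actual[valid]) / actual[valid]
    return float(np.mean(error)), float(np.std(error))

# Hypothetical FAR values at four non-GV grid boxes
actual_far = np.array([0.5, 0.4, 0.2, 0.0])
kriged_far = np.array([0.6, 0.3, 0.25, 0.1])
mean_err, std_err = transfer_error_stats(kriged_far, actual_far)
```

Multiplying both values by 100 expresses them as percentages, the form in which transfer errors are quoted in the discussion of accuracy and precision.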

Tables 1a and 1b summarize the assessment of the OK method using mean relative error (Eqn. 5) as the main assessment metric for transfer of the uncertainty metrics BIAS, RMSE, POD and FAR. For each uncertainty metric, both the mean and standard deviation of the transfer error are shown as measures of accuracy and precision, respectively. It is observed that, unlike the correlation measure, the mean and standard deviation of error reveal a somewhat different picture of the utility of the OK method. The uncertainty metric BIAS has the lowest accuracy, ranging from 50% error (at 10% missing GV grid boxes) to 100% error (at 90% missing GV grid boxes) for the monthly time scale. For the weekly timescale, the mean error ranges from 80% (at 10% missing GV grid boxes) to 120% (at 90% missing GV grid boxes). On the other hand, POD, followed by RMSE and FAR, has the highest accuracy for transfer of uncertainty metrics at both timescales according to the mean relative error measure. As expected, the precision of the kriging-based transfer scheme degrades at shorter timescales. At very low GV coverage (<20%) the standard deviation of transfer error is high (>100%), indicating poor performance of the OK method regardless of the timescale at which the uncertainty metrics are transferred.

5.0 CONCLUSION: THE CURRENT OPEN ISSUES ON UNCERTAINTY TRANSFER

In light of the impact of GV coverage and the timescale on effectiveness of uncertainty transfer, we now need to consider the following: 1) explore other techniques for transfer that are more sophisticated than ordinary kriging; and 2) understand how we can leverage the methodological error estimate that is routinely available from uncertainty models such as Huffman (1997) and Kalman filtering techniques that many satellite rainfall data algorithms use (such as Ushio et al., 2009). In our interpolation method, the spatial structure of rainfall has not been used alongside that of estimation uncertainty. Because the two (rainfall and its estimation uncertainty) are related, it may be worthwhile to pursue co-kriging type conditional interpolation schemes that leverage existing information on the satellite rainfall distribution as an extra constraint.

Also, for spatial interpolation methods, we should keep in mind that traditional geostatistical tools are pattern-filling methods based on the spatial correlation exhibited by two points in space separated by a lag h. The variogram computed using this two-point geostatistical approach may oversimplify the spatial patterns manifested by the complex precipitation systems and surface emissivities that dictate the accuracy of satellite rainfall products at hydrologic time scales over land. For the case of spatial interpolation of groundwater contamination, it has recently been demonstrated that a highly non-linear pattern learning technique in the form of an artificial neural network (ANN) can yield significantly superior results under the same set of constraints when compared to the ordinary kriging method (Chowdhury et al., 2009). Thus, the use of non-linear mapping techniques is worth investigating.
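To make the two-point statistic concrete, the following is a minimal sketch of the classical empirical semivariogram, gamma(h) = sum((z_i - z_j)^2) / (2 N(h)), evaluated on a gridded field of an uncertainty metric. It is an illustration under stated assumptions and not the procedure used in this study: pairs are formed only along grid rows and columns, the lag h is counted in grid boxes, and the function name `empirical_semivariogram` and the 3x3 demonstration field are hypothetical.

```python
from typing import Dict, List, Optional

Grid = List[List[Optional[float]]]


def empirical_semivariogram(field: Grid, max_lag: int) -> Dict[int, float]:
    """Classical two-point estimator of the semivariogram.

    gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)) over all pairs separated by lag h.
    Assumption for this sketch: pairs are taken only along grid rows and
    columns, and h is measured in grid boxes. Cells set to None are treated
    as grid boxes without a GV-based metric and are skipped.
    """
    n_rows, n_cols = len(field), len(field[0])
    gamma: Dict[int, float] = {}
    for h in range(1, max_lag + 1):
        sq_sum, n_pairs = 0.0, 0
        for r in range(n_rows):
            for c in range(n_cols):
                z = field[r][c]
                if z is None:
                    continue
                # Pair each cell with the cell h boxes further along its row
                # and the cell h boxes further along its column.
                for rr, cc in ((r, c + h), (r + h, c)):
                    if rr < n_rows and cc < n_cols and field[rr][cc] is not None:
                        sq_sum += (z - field[rr][cc]) ** 2
                        n_pairs += 1
        if n_pairs > 0:
            gamma[h] = sq_sum / (2 * n_pairs)
    return gamma


# Hypothetical 3x3 field of an uncertainty metric (illustrative values only).
demo_field: Grid = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
demo_gamma = empirical_semivariogram(demo_field, max_lag=2)
```

Because every pair contributes only a squared difference at a single lag, any structure that depends on three or more points jointly (for example curved or banded precipitation features) is invisible to this statistic, which is the limitation noted above.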

Another aspect to keep in mind is the intended use of each uncertainty metric. Different users will naturally have different needs. Hydrologists engaged in flash flood or monsoonal flood forecasting will probably be more interested in PODrain (to understand the accuracy in estimating peak flow), FAR (to minimize false alarms in flood warnings), and BIAS (to minimize under- or overestimation of river stage) for each grid box (see Harris and Hossain, 2008, and Hossain and Anagnostou, 2004). Hydrologists engaged in continuous simulation based on soil moisture accounting for drought monitoring and water management would probably focus more on PODnorain (to minimize uncertainty in underestimating the soil wetness and evapotranspiration) for each grid box. On the other hand, crop yield and famine forecasters would focus more on the seasonal bias over a large agricultural zone during the growing season as the important indicator of the reliability of a satellite rainfall product (personal communication with Dr. Chris Funk of the University of California, Santa Barbara).

In summary, an uncertainty transfer scheme that is amenable to operational implementation and can estimate uncertainty metrics for satellite rainfall data over regions lacking surface GV data is necessary for current and future satellite precipitation missions to advance their hydrologic potential. Hydrologists around the world need a clear understanding of the pros and cons of applying satellite rainfall data to terrestrial hydrologic applications at a given scale if the benefit of these missions is to be maximized. One way of facilitating this understanding is through the routine provision of various measures of uncertainty that are of hydrologic relevance. If this uncertainty information is provided alongside the global and more frequent precipitation observational capability planned for GPM, it will permit us to refine knowledge from physical and hydrologic models, which can then be converted to local and global strategies for water resources management. Work is currently under way to address some of the open issues discussed above, and we hope to report on it in the near future.

APPENDIX: FORMULATION OF UNCERTAINTY METRICS

Consider the 2x2 contingency table of hits and misses associated with satellite rainfall estimates:

TABLE A.1

Here N_A and N_D are the numbers of hits (satellite and GV agree on rain and on no rain, respectively), N_B is the number of GV rain events that the satellite reports as no rain, and N_C is the number of GV no-rain events that the satellite reports as rain.

Probability of Detection for Rain (PODrain):

$$\mathrm{POD}_{rain} = \frac{N_A}{N_A + N_B} \qquad (A.1)$$

Probability of Detection for No Rain (PODnorain):

$$\mathrm{POD}_{norain} = \frac{N_D}{N_D + N_C} \qquad (A.2)$$

False Alarm Ratio (FAR):

$$\mathrm{FAR} = \frac{N_C}{N_A + N_C} \qquad (A.3)$$

The PODrain essentially defines how often a satellite rainfall estimate is likely to correctly detect grid boxes as rainy according to the reference or ground validation data. Similarly, PODnorain defines how often a satellite rainfall estimate is likely to correctly detect a non-rainy grid box as non-rainy according to the ground validation data.

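As an illustration of how the three detection metrics follow from the contingency counts, the sketch below tallies the four cells from paired satellite and GV values and then forms the ratios. It is a minimal example under assumptions that are not stated in the text: the cell labels (n_a and n_d for hits, n_b for rain missed by the satellite, n_c for false alarms), the rain/no-rain threshold of zero, the function name `contingency_metrics` and the sample values are all hypothetical.

```python
from typing import Dict, Sequence


def contingency_metrics(sat: Sequence[float], gv: Sequence[float],
                        threshold: float = 0.0) -> Dict[str, float]:
    """Compute PODrain, PODnorain and FAR from paired satellite/GV rain values.

    Assumed cell labels for this sketch:
      n_a: satellite rain,    GV rain     (hit)
      n_b: satellite no rain, GV rain     (miss)
      n_c: satellite rain,    GV no rain  (false alarm)
      n_d: satellite no rain, GV no rain  (hit)
    A value above `threshold` counts as rain (the threshold is an assumption).
    """
    n_a = n_b = n_c = n_d = 0
    for s, g in zip(sat, gv):
        sat_rain, gv_rain = s > threshold, g > threshold
        if sat_rain and gv_rain:
            n_a += 1
        elif gv_rain:
            n_b += 1
        elif sat_rain:
            n_c += 1
        else:
            n_d += 1

    def ratio(num: int, den: int) -> float:
        # Return NaN when the metric is undefined (no events in the denominator).
        return num / den if den > 0 else float("nan")

    return {
        "POD_rain": ratio(n_a, n_a + n_b),
        "POD_norain": ratio(n_d, n_d + n_c),
        "FAR": ratio(n_c, n_a + n_c),
    }


# Hypothetical rain rates for one grid box over six time steps.
sat_demo = [0.0, 2.5, 1.0, 0.0, 3.0, 0.0]
gv_demo = [0.0, 1.8, 0.0, 0.4, 2.2, 0.0]
metrics = contingency_metrics(sat_demo, gv_demo)
```

Returning NaN for an empty denominator keeps grid boxes with no GV rain events (or no satellite rain events) from being silently scored as perfect or as failures.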
LIST OF FIGURE CAPTIONS

Figure 1. Conceptual rendition of the idea of ‘transfer’ of uncertainty information from a gaged (GV) location to an ungaged (non-GV) location. Upper panel depicts the notion of ‘uncertainty’ of satellite rainfall data (in this case, the scalar deviation of magnitudes is termed ‘uncertainty’ although there are many other types of uncertainty). Lower panel depicts how the known uncertainty (derived from GV sites shown in black in the middle panel) would be ‘transferred’ to the non-GV (ungaged) sites shown in blue (right-most panel). Reprinted from Tang and Hossain, 2009.

Figure 2. Correlation length of uncertainty metrics at climatologic timescales for 3B41RT (upper panel) and 3B42RT (lower panel) shown as a function of season. Note the distance unit is 0.25 degree grid boxes (~25 km). The vertical bars are shown in order from left to right as ‘Bias’, ‘RMSE’, ‘POD rain’, ‘POD no-rain’, ‘FAR’. (Taken from Tang and Hossain, 2009).

Figure 3. Left panels: probability of a 3B42RT 0.25° grid box having a PMW scan from either TMI, SSMIs or AMSR for 3 hour (upper), 6 hour (middle), and 24 hour periods (bottom). Right panels: Same as left panels but only for TMI.

Figure 4a. Transfer of Bias of 3B41RT from gauged to ungauged locations. Upper left-most panel shows the true field of uncertainty in bias based on 6 years of data. The lower left-most panel is the randomly selected 50% of the region used for computation of the empirical variogram and correlation length. The lower middle panel shows the other 50% of the region that was assumed to have no GV. Lower right panel shows the estimation of the bias at the non-GV grid boxes using ordinary kriging.

Figure 4b. Histograms of kriging error and actual error for false alarm ratio (FAR) over ungauged grid boxes. Here kriging error (shown in black) is defined as the difference between transferred (or kriged) FAR and the actual FAR derived from GV data. The actual GV-based FAR is shown in pink.


Figure 5. Impact of GV coverage (or sparseness) on the effectiveness of uncertainty metric transfer by ordinary kriging at climatologic scale (upper panel) and seasonal scale (lower panel) for the central US. Computed with TMPA data collected for a 100 day period (May-August) in 2007-2008.

Figure 1. Conceptual rendition of the idea of ‘transfer’ of uncertainty information from a gaged (GV) location to an ungaged (non-GV) location. Upper panel depicts the notion of ‘uncertainty’ of satellite rainfall data (in this case, the scalar deviation of magnitudes is termed ‘uncertainty’ although there are many other types of uncertainty). Lower panel depicts how the known uncertainty (derived from GV sites shown in black in the middle panel) would be ‘transferred’ to the non-GV (ungaged) sites shown in blue (right-most panel). Reprinted from Tang and Hossain, 2009.

Figure 2. Correlation length of uncertainty metrics for 3B41RT (upper panel) and 3B42RT (lower panel) shown as a function of season. Note the distance unit is 0.25 degree grid boxes (~25 km). The vertical bars are shown in order from left to right as ‘Bias’, ‘RMSE’, ‘POD rain’, ‘POD no-rain’, ‘FAR’. (Taken from Tang and Hossain, 2009).

Figure 3. Left panel: probability of a 3B42RT 0.25° grid box having a PMW scan from either TMI, SSMIs or AMSR for 3 hour (upper), 6 hour (middle) and 24 hour periods (bottom). Right panel: Same as left panel but only for TMI. These probability maps were created using a 100 day period of PMW sensor scans from the 2007-2008 period.

Figure 4a. Transfer of Bias of 3B41RT from gauged to ungauged locations. Upper left-most panel shows the true field of uncertainty in bias based on 6 years of data. The lower left-most panel is the randomly selected 50% of the region used for computation of the empirical variogram and correlation length. The lower middle panel shows the other 50% of the region that was assumed to be non-GV grid boxes. Lower right panel shows the estimation of the bias at the non-GV grid boxes using ordinary kriging.

Figure 4b. Histograms of kriging error and actual uncertainty for false alarm ratio (FAR) over ungauged grid boxes. Here kriging error (shown in black) is defined as the difference between transferred (or kriged) FAR and the actual FAR derived from GV data. The actual GV-based FAR is shown in pink.

Table 1a. Assessment of the transfer of uncertainty metrics at monthly time scales (for summer months of June-July-August).

Month   % of region       BIAS                RMSE                POD                 FAR
        lacking GV data   Mean(1)   Std.(2)   Mean(1)   Std.(2)   Mean(1)   Std.(2)   Mean(1)   Std.(2)

        10                0.53      0.78      0.19      0.18      0.12      0.10      0.22      0.27
        20                0.64      0.86      0.22      0.29      0.13      0.13      0.23      0.25
        30                0.60      0.93      0.21      0.24      0.13      0.12      0.24      0.31
        40                0.66      1.00      0.22      0.24      0.13      0.14      0.22      0.26
        50                0.68      1.08      0.23      0.20      0.14      0.16      0.24      0.37
        60                0.73      1.12      0.25      0.27      0.14      0.16      0.25      0.33
        70                0.75      1.11      0.26      0.25      0.15      0.15      0.26      0.35
        80                0.85      1.24      0.27      0.30      0.16      0.16      0.28      0.40
        90                1.00      1.43      0.31      0.29      0.19      0.23      0.31      0.47

        10                0.54      0.71      0.20      0.20      0.16      0.15      0.32      0.43
        20                0.64      0.98      0.22      0.22      0.17      0.18      0.27      0.31
        30                0.67      0.98      0.22      0.21      0.16      0.19      0.28      0.32
        40                0.61      0.85      0.23      0.24      0.18      0.22      0.31      0.41
        50                0.67      1.01      0.25      0.24      0.18      0.21      0.31      0.39
        60                0.72      1.09      0.26      0.27      0.17      0.19      0.32      0.40
        70                0.80      1.13      0.26      0.24      0.19      0.24      0.33      0.44
        80                0.87      1.32      0.30      0.34      0.21      0.25      0.35      0.48
        90                0.99      1.46      0.33      0.34      0.24      0.30      0.39      0.51

        10                0.47      0.79      0.21      0.22      0.14      0.17      0.28      0.47
        20                0.56      0.84      0.24      0.32      0.16      0.19      0.28      0.41
        30                0.52      0.78      0.23      0.28      0.15      0.17      0.28      0.35
        40                0.62      1.00      0.24      0.30      0.15      0.17      0.27      0.34
        50                0.62      0.95      0.25      0.34      0.16      0.21      0.27      0.35
        60                0.70      1.09      0.26      0.30      0.17      0.21      0.27      0.34
        70                0.69      1.05      0.30      0.36      0.17      0.20      0.29      0.40
        80                0.78      1.16      0.36      0.49      0.18      0.20      0.31      0.43
        90                0.83      1.19      0.35      0.42      0.21      0.23      0.33      0.50

Error = (Interpolated Uncertainty Metric - Actual Uncertainty Metric)/Actual Uncertainty Metric
(1) Mean = Mean Error, the mean of Error (as defined above) over non-GV grid boxes
(2) Std. = Std. Dev of Error, the standard deviation of Error (as defined above) over non-GV grid boxes
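
The Error definition in the table footnotes can be sketched in a few lines. This is a minimal illustration and not the code used to produce the tables: the function name `transfer_error_stats`, the use of the population standard deviation, the skipping of grid boxes whose actual metric is zero, and the sample FAR values are all assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def transfer_error_stats(interpolated: Sequence[float],
                         actual: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard deviation of the relative transfer error.

    Error = (interpolated metric - actual metric) / actual metric,
    evaluated at each non-GV grid box. Grid boxes where the actual metric
    is zero are skipped because Error is undefined there (an assumption
    of this sketch). The population standard deviation is used; the text
    does not say which form was applied.
    """
    errors = [(est - ref) / ref for est, ref in zip(interpolated, actual) if ref != 0]
    return mean(errors), pstdev(errors)


# Hypothetical kriged and GV-based FAR values at four non-GV grid boxes.
kriged_far = [0.30, 0.20, 0.45, 0.10]
gv_far = [0.25, 0.25, 0.30, 0.10]
mean_error, std_error = transfer_error_stats(kriged_far, gv_far)
```

Because positive and negative errors can cancel in the mean, the standard deviation of Error is the more informative of the two columns when judging how reliable the transfer is at an individual grid box.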
Table 1b. Assessment of transfer of uncertainty metrics at weekly time scales (for June weeks).

Week    % of region       BIAS                RMSE                POD                 FAR
(June)  lacking GV data   Mean(1)   Std.(2)   Mean(1)   Std.(2)   Mean(1)   Std.(2)   Mean(1)   Std.(2)

1st     10                0.78      1.13      0.45      0.73      0.27      0.29      0.35      0.32
        20                0.88      1.17      0.46      0.68      0.28      0.29      0.35      0.32
        30                0.87      1.16      0.41      0.53      0.31      0.34      0.36      0.33
        40                0.86      1.15      0.48      0.76      0.33      0.35      0.35      0.31
        50                0.97      1.25      0.41      0.54      0.33      0.35      0.34      0.30
        60                0.93      1.15      0.53      0.83      0.33      0.36      0.38      0.34
        70                1.05      1.35      0.52      0.80      0.34      0.36      0.38      0.32
        80                0.98      1.15      0.56      0.81      0.37      0.38      0.38      0.33
        90                1.15      1.37      0.74      1.06      0.39      0.38      0.40      0.38

2nd     10                0.85      1.23      0.39      0.77      0.23      0.25      0.35      0.31
        20                0.81      1.15      0.37      0.64      0.25      0.27      0.34      0.32
        30                0.82      1.20      0.47      0.78      0.26      0.29      0.37      0.37
        40                0.80      1.19      0.47      0.77      0.27      0.31      0.37      0.37
        50                0.90      1.33      0.47      0.75      0.26      0.27      0.35      0.33
        60                0.94      1.28      0.48      0.78      0.28      0.31      0.36      0.32
        70                0.94      1.29      0.51      0.84      0.29      0.32      0.38      0.36
        80                1.08      1.45      0.61      0.96      0.29      0.33      0.38      0.38
        90                1.11      1.46      0.68      1.06      0.31      0.38      0.40      0.37

3rd     10                0.76      1.04      0.42      0.56      0.28      0.33      0.31      0.28
        20                0.80      1.25      0.50      0.77      0.28      0.29      0.32      0.29
        30                0.79      1.25      0.54      0.91      0.31      0.34      0.33      0.30
        40                0.82      1.19      0.52      0.80      0.32      0.37      0.32      0.29
        50                0.83      1.18      0.55      0.93      0.31      0.33      0.32      0.32
        60                0.86      1.26      0.52      0.75      0.32      0.36      0.33      0.32
        70                0.94      1.32      0.58      0.90      0.32      0.34      0.33      0.33
        80                0.98      1.30      0.60      0.88      0.35      0.38      0.34      0.35
        90                1.09      1.43      0.73      1.16      0.39      0.44      0.39      0.38

Error = (Interpolated Uncertainty Metric - Actual Uncertainty Metric)/Actual Uncertainty Metric
(1) Mean = Mean Error, the mean of Error (as defined above) over non-GV grid boxes
(2) Std. = Std. Dev of Error, the standard deviation of Error (as defined above) over non-GV grid boxes

Table A.1. Contingency Table (a HIT is defined when both satellite and GV agree on the type of event detected; a MISS is when there is disagreement between satellite and GV detected events).